## IN THE CLAIMS:

Please add claims 26-28, and please amend claims 7, 10, 11, 14, 15, and 22, as indicated below.

- 1. (Original) A floating point multiplier circuit configured for performing extended-precision multiplication of an N-bit multiplicand value by an M-bit multiplier value, wherein N and M are positive integers, said floating point multiplier circuit comprising:
  - partial product generation logic configured to generate a plurality of partial products from said multiplicand value and said multiplier value, wherein said plurality of partial products corresponds to a first portion of said multiplier value during a first partial product execution phase, and wherein said plurality of partial products further corresponds to a second portion of said multiplier value during a second partial product execution phase;
  - a plurality of carry save adders coupled to said partial product generation logic and configured to accumulate said plurality of partial products generated during said first partial product execution phase into a redundant product during a first carry save adder execution phase, and further configured to accumulate said plurality of partial products generated during said second partial product execution phase into said redundant product during a second carry save adder execution phase; and
  - a first carry propagate adder coupled to said plurality of carry save adders and configured to reduce a first portion of said redundant product to a multiplicative product during a first carry propagate adder phase, and further configured to reduce a second portion of said redundant product to said multiplicative product during a second carry propagate adder phase;

- wherein said first carry propagate adder phase begins after said second carry save adder execution phase completes.
- 2. (Original) The floating point multiplier circuit as recited in claim 1, wherein:
- said plurality of carry save adders is further configured to perform an arithmetic left shift on said redundant product accumulated during said first carry save adder execution phase by a number of bits corresponding to said first portion of said multiplier value; and
- said plurality of carry save adders is further configured to accumulate a result of said arithmetic left shift with said second portion of said plurality of partial products into said redundant product during said second carry save adder execution phase.
- 3. (Original) The floating point multiplier circuit as recited in claim 1, wherein:
- said first portion of said multiplier value corresponds to a higher-order portion of said multiplier value;
- said second portion of said multiplier value corresponds to a lower-order portion of said multiplier value;
- said first portion of said redundant product corresponds to a lower-order portion of said redundant product; and
- said second portion of said redundant product corresponds to a higher-order portion of said redundant product.
- 4. (Original) The floating point multiplier circuit as recited in claim 1, wherein:

said redundant product includes a Q-bit sum term and an R-bit carry term;

said first carry propagate adder includes a plurality of operand inputs, wherein each operand input includes at most P bits; and

each of P, Q, and R is a positive integer, P is less than Q, and P is less than R.
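
By way of illustration only, and not as part of the claims, the two-pass operation recited in claims 1 through 4 may be modeled in software. The Python sketch below assumes unsigned operands, assumes M is equal to 2Y as in claim 9, and assumes a simple AND-array partial product generator rather than Booth logic. The names `csa`, `partial_products`, `accumulate`, and `two_pass_multiply`, and the parameters `y_bits` and `p_bits`, are hypothetical.

```python
def csa(a, b, c):
    """3:2 carry save adder: a + b + c == sum_term + carry_term."""
    sum_term = a ^ b ^ c
    carry_term = ((a & b) | (a & c) | (b & c)) << 1
    return sum_term, carry_term


def partial_products(multiplicand, multiplier_part, width):
    """One partial product per multiplier bit (assumed AND-array, no Booth)."""
    return [multiplicand << i if (multiplier_part >> i) & 1 else 0
            for i in range(width)]


def accumulate(sum_term, carry_term, products):
    """Fold partial products into the redundant (sum, carry) product."""
    for pp in products:
        sum_term, carry_term = csa(sum_term, carry_term, pp)
    return sum_term, carry_term


def two_pass_multiply(multiplicand, multiplier, y_bits, p_bits):
    """Multiply by a (2 * y_bits)-bit multiplier in two passes."""
    high = multiplier >> y_bits
    low = multiplier & ((1 << y_bits) - 1)

    # First phases: partial products for the higher-order multiplier portion.
    s, c = accumulate(0, 0, partial_products(multiplicand, high, y_bits))

    # Left shift the redundant product, then add the lower-order portion.
    s, c = s << y_bits, c << y_bits
    s, c = accumulate(s, c, partial_products(multiplicand, low, y_bits))

    # First carry propagate phase: lower-order p_bits of the redundant product.
    mask = (1 << p_bits) - 1
    low_total = (s & mask) + (c & mask)
    carry_out = low_total >> p_bits

    # Second carry propagate phase: higher-order bits plus the carry out.
    high_total = (s >> p_bits) + (c >> p_bits) + carry_out
    return (high_total << p_bits) | (low_total & mask)
```

In this model the carry propagate step handles only `p_bits` of each term per phase, which mirrors the relationship in claim 4 where P is less than both Q and R. Carry propagation starts only after both accumulation passes have finished, as in the final clause of claim 1.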

- 5. (Original) The floating point multiplier circuit as recited in claim 1 further comprising a plurality of rounding adders coupled to said plurality of carry save adders and configured to produce a respective plurality of rounded multiplicative products.
- 6. (Original) The floating point multiplier circuit as recited in claim 5, wherein each rounding adder is further configured to:

receive a respective rounding constant;

- accumulate said respective rounding constant with a first portion of said redundant product into a rounded redundant product during said first carry propagate adder phase;
- reduce a first portion of said rounded redundant product to a given respective rounded multiplicative product during said first carry propagate adder phase; and
- reduce a second portion of said rounded redundant product to said given respective rounded multiplicative product during said second carry propagate adder phase.
- 7. (Currently amended) The floating point multiplier circuit as recited in claim 1, further configured for performing pipelined reduced-precision multiplication of said N-bit multiplicand value by an S-bit multiplier value with a single partial product execution phase, a single carry save adder execution phase, and a single carry propagate adder phase, wherein S is a positive integer and S is less than or equal to N/2, and wherein each of said single partial product execution phase, said single carry save adder execution phase, and said single propagate adder phase [[may]] is operable to accept a new reduced-precision multiplication operation during a given execution cycle.

- 8. (Original) The floating point multiplier circuit as recited in claim 1, wherein said partial product generation logic includes a plurality of Booth encoders and a plurality of Booth multiplexers.
- 9. (Original) The floating point multiplier circuit as recited in claim 1, wherein M is equal to 2Y, wherein said first portion of said multiplier value includes the most significant Y bits of said multiplier value, and wherein said second portion of said multiplier value includes the least significant Y bits of said multiplier value.
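
For illustration only, the Booth encoders and Booth multiplexers of claim 8 may be modeled as follows. Claim 8 does not specify a radix; this Python sketch assumes radix-4 Booth recoding of an unsigned multiplier, and the names `BOOTH_DIGIT`, `booth_encode`, `booth_mux`, and `booth_multiply` are hypothetical. The final summation uses ordinary addition in place of carry save adders.

```python
# Radix-4 Booth digit for each overlapping 3-bit group (b[i+1], b[i], b[i-1]).
BOOTH_DIGIT = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
               0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}


def booth_encode(multiplier, bits):
    """Recode an unsigned `bits`-bit multiplier into digits in {-2, ..., 2}."""
    padded = multiplier << 1  # implicit zero to the right of bit 0
    # The final group covers the zero-extended bits above the multiplier.
    return [BOOTH_DIGIT[(padded >> i) & 0b111] for i in range(0, bits + 1, 2)]


def booth_mux(multiplicand, digit):
    """Select 0, +/-X, or +/-2X according to the Booth digit."""
    options = {0: 0, 1: multiplicand, 2: multiplicand << 1,
               -1: -multiplicand, -2: -(multiplicand << 1)}
    return options[digit]


def booth_multiply(multiplicand, multiplier, bits):
    """Sum the selected partial products, each weighted by 4**k."""
    digits = booth_encode(multiplier, bits)
    return sum(booth_mux(multiplicand, d) << (2 * k)
               for k, d in enumerate(digits))
```

Under these assumptions, recoding halves the number of partial products relative to one partial product per multiplier bit, at the cost of one additional digit for the zero extension.
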
- 10. (Currently amended) A method of operation of a multiplier circuit, comprising:
  - <u>said multiplier circuit</u> receiving an N-bit multiplicand value and an M-bit multiplier value;
  - said multiplier circuit generating a plurality of partial products from said multiplicand value and said multiplier value, wherein said plurality of partial products corresponds to a first portion of said multiplier value during a first partial product execution phase, and wherein said plurality of partial products further corresponds to a second portion of said multiplier value during a second partial product execution phase;
  - said multiplier circuit accumulating said plurality of partial products generated during said first partial product execution phase into a redundant product during a first carry save adder execution phase;

  - said multiplier circuit accumulating said plurality of partial products generated during said second partial product execution phase into said redundant product during a second carry save adder execution phase;
  - <u>said multiplier circuit</u> reducing a first portion of said redundant product to a multiplicative product during a first carry propagate adder phase; and
  - <u>said multiplier circuit</u> reducing a second portion of said redundant product to said multiplicative product during a second carry propagate adder phase;
  - wherein said first carry propagate adder phase begins after said second carry save adder execution phase completes.
- 11. (Currently amended) The method as recited in claim 10, further comprising:
  - said multiplier circuit performing an arithmetic left shift on said redundant product accumulated during said first carry save adder execution phase by a number of bits corresponding to said first portion of said multiplier value; and
  - said multiplier circuit accumulating a result of said arithmetic left shift with said second portion of said plurality of partial products into said redundant product during said second carry save adder execution phase.
- 12. (Original) The method as recited in claim 10, wherein:
  - said first portion of said multiplier value corresponds to a higher-order portion of said multiplier value;
  - said second portion of said multiplier value corresponds to a lower-order portion of said multiplier value;
  - said first portion of said redundant product corresponds to a lower-order portion of said redundant product; and
  - said second portion of said redundant product corresponds to a higher-order portion of said redundant product.

13. (Original) The method as recited in claim 10, wherein:

said redundant product includes a Q-bit sum term and an R-bit carry term;

each of said first and second portion of said redundant product includes at most P bits; and

each of P, Q, and R is a positive integer, P is less than Q, and P is less than R.

14. (Currently amended) The method as recited in claim 10, further comprising:

said multiplier circuit receiving a plurality of rounding constants;

- <u>said multiplier circuit</u> accumulating each rounding constant with a first portion of said redundant product into a respective rounded redundant product during said first carry propagate adder phase;
- said multiplier circuit reducing a first portion of each said respective rounded redundant product to a given respective rounded multiplicative product during said first carry propagate adder phase; and

<u>said multiplier circuit</u> reducing a second portion of each said respective rounded redundant product to said given respective rounded multiplicative product during said second carry propagate adder phase.

15. (Currently amended) The method as recited in claim 10, further comprising said multiplier circuit selectively performing pipelined reduced-precision multiplication of said N-bit multiplicand value by an S-bit multiplier value with a single partial product execution phase, a single carry save adder execution phase, and a single carry propagate adder phase, wherein S is a positive integer and S is less than or equal to N/2, and wherein each of said single partial product execution phase, said single carry save adder execution phase, and said single propagate adder phase [[may]] is operable to accept a new reduced-precision multiplication operation during a given execution cycle.

16. (Original) A microprocessor comprising:

dispatch logic configured to issue multiply instructions to a floating-point unit; and

a floating-point unit coupled to said dispatch logic and configured to:

receive an N-bit multiplicand value and an M-bit multiplier value;

generate a plurality of partial products from said multiplicand value and said multiplier value, wherein said plurality of partial products corresponds to a first portion of said multiplier value during a first partial product execution phase, and wherein said plurality of partial products further corresponds to a second portion of said multiplier value during a second partial product execution phase;

- accumulate said plurality of partial products generated during said first partial product execution phase into a redundant product during a first carry save adder execution phase;
- accumulate said plurality of partial products generated during said second partial product execution phase into said redundant product during a second carry save adder execution phase;
- reduce a first portion of said redundant product to a multiplicative product during a first carry propagate adder phase; and
- reduce a second portion of said redundant product to said multiplicative product during a second carry propagate adder phase;
- wherein said first carry propagate adder phase begins after said second carry save adder execution phase completes.
- 17. (Original) The microprocessor as recited in claim 16, wherein:
- said plurality of carry save adders is further configured to perform an arithmetic left shift on said redundant product accumulated during said first carry save adder execution phase by a number of bits corresponding to said first portion of said multiplier value; and
- said plurality of carry save adders is further configured to accumulate a result of said arithmetic left shift with said second portion of said plurality of partial products into said redundant product during said second carry save adder execution phase.
- 18. (Original) The microprocessor as recited in claim 16, wherein:

said first portion of said multiplier value corresponds to a higher-order portion of said multiplier value;

said second portion of said multiplier value corresponds to a lower-order portion of said multiplier value;

said first portion of said redundant product corresponds to a lower-order portion of said redundant product; and

said second portion of said redundant product corresponds to a higher-order portion of said redundant product.

19. (Original) The microprocessor as recited in claim 16, wherein:

said redundant product includes a Q-bit sum term and an R-bit carry term;

said first carry propagate adder includes a plurality of operand inputs, wherein each operand input includes at most P bits; and

each of P, Q, and R is a positive integer, P is less than Q, and P is less than R.

- 20. (Original) The microprocessor as recited in claim 16 further comprising a plurality of rounding adders coupled to said plurality of carry save adders and configured to produce a respective plurality of rounded multiplicative products.
- 21. (Original) The microprocessor as recited in claim 20, wherein each rounding adder is further configured to:

receive a respective rounding constant;

accumulate said respective rounding constant with a first portion of said redundant product into a rounded redundant product during said first carry propagate adder phase;

reduce a first portion of said rounded redundant product to a given respective rounded multiplicative product during said first carry propagate adder phase; and

reduce a second portion of said rounded redundant product to said given respective rounded multiplicative product during said second carry propagate adder phase.
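
For illustration only, the rounding adders of claims 20 and 21 may be modeled as follows. This Python sketch assumes unsigned sum and carry terms and treats each rounding adder as one loop iteration. The rounding constant values, the names `csa` and `rounded_products`, and the parameter `p_bits` are hypothetical.

```python
def csa(a, b, c):
    """3:2 carry save adder: a + b + c == sum_term + carry_term."""
    sum_term = a ^ b ^ c
    carry_term = ((a & b) | (a & c) | (b & c)) << 1
    return sum_term, carry_term


def rounded_products(sum_term, carry_term, rounding_constants, p_bits):
    """Return one rounded product per rounding constant."""
    mask = (1 << p_bits) - 1
    results = []
    for constant in rounding_constants:
        # First carry propagate phase: fold the constant into the
        # lower-order portion of the redundant product, then reduce it.
        rs, rc = csa(sum_term & mask, carry_term & mask, constant)
        low_total = rs + rc
        carry_out = low_total >> p_bits

        # Second carry propagate phase: reduce the higher-order portion,
        # including the carry out of the first phase.
        high_total = (sum_term >> p_bits) + (carry_term >> p_bits) + carry_out
        results.append((high_total << p_bits) | (low_total & mask))
    return results
```

Each result equals the unrounded product plus the corresponding constant, so a later selection step (not modeled here) could choose among the candidates without a further addition.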

22. (Currently amended) The microprocessor as recited in claim 16, further configured for performing pipelined reduced-precision multiplication of said N-bit multiplicand value by an S-bit multiplier value with a single partial product execution phase, a single carry save adder execution phase, and a single carry propagate adder phase, wherein S is a positive integer and S is less than or equal to N/2, and wherein each of said single partial product execution phase, said single carry save adder execution phase, and said single propagate adder phase [[may]] is operable to accept a new reduced-precision multiplication operation during a given execution cycle.

23-25. (Canceled)

26. (New) A microprocessor comprising:

dispatch logic configured to issue multiply instructions to a floating-point unit; and

a floating-point unit coupled to said dispatch logic and configured to perform, during an extended-precision mode of operation, extended precision multiplication of an N-bit multiplicand value and an M-bit multiplier value, where N and M are positive integers and M is greater than N/2;

wherein said floating-point unit is further configured to perform, during a reduced-precision mode of operation, pipelined reduced-precision multiplication of said N-bit multiplicand value by an S-bit multiplier value, wherein S is a positive integer and S is less than or equal to N/2;

wherein during said extended-precision mode of operation, said floating-point unit is configured to produce extended precision products with a maximum throughput of one extended precision product produced every other cycle; and

wherein during said reduced-precision mode of operation, said floating-point unit is configured to produce reduced precision products with a maximum throughput of one product produced every cycle.
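
For illustration only, the throughput relationship recited in claim 26 may be modeled with a simple stage-occupancy calculation. This Python sketch assumes three pipeline stages and assumes that each operation holds a stage for `passes` consecutive cycles (two in the extended-precision mode, one in the reduced-precision mode). The name `completion_cycles` is hypothetical, and the latency values the model produces are artifacts of these assumptions; only the spacing between completions is of interest.

```python
def completion_cycles(num_ops, passes, stages=3):
    """Cycle at which each back-to-back operation leaves the last stage."""
    free_at = [0] * stages  # first cycle each stage can accept a new operation
    done = []
    for _ in range(num_ops):
        ready = 0
        for stage in range(stages):
            start = max(ready, free_at[stage])
            free_at[stage] = start + passes  # stage is held for every pass
            ready = start + passes
        done.append(ready)
    return done


extended = completion_cycles(4, passes=2)
reduced = completion_cycles(4, passes=1)
print(extended)  # completions spaced two cycles apart
print(reduced)   # completions spaced one cycle apart
```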

27. (New) The microprocessor as recited in claim 26,

wherein during said extended-precision mode of operation, said floating-point unit is further configured to:

generate a plurality of partial products from said multiplicand value and said multiplier value, wherein said plurality of partial products corresponds to a first portion of said multiplier value during a first partial product execution phase, and wherein said plurality of partial products further corresponds to a second portion of said multiplier value during a second partial product execution phase;

accumulate said plurality of partial products generated during said first partial product execution phase into a redundant product during a first carry save adder execution phase;

accumulate said plurality of partial products generated during said second partial product execution phase into said redundant product during a second carry save adder execution phase;

reduce a first portion of said redundant product to a multiplicative product during a first carry propagate adder phase; and

reduce a second portion of said redundant product to said multiplicative product during a second carry propagate adder phase;

wherein said first carry propagate adder phase begins after said second carry save adder execution phase completes; and

wherein during said reduced-precision mode of operation, said floating-point unit is further configured to perform pipelined reduced-precision multiplication of said N-bit multiplicand value by said S-bit multiplier value with a single partial product execution phase, a single carry save adder execution phase, and a single carry propagate adder phase, wherein each of said single partial product execution phase, said single carry save adder execution phase, and said single propagate adder phase is operable to accept a new reduced-precision multiplication operation during any given execution cycle.

28. (New) The microprocessor as recited in claim 26, wherein during said extended-precision mode of operation, each extended precision multiplication operation has a fixed latency that is independent of whether result rounding occurs or a denormal result is produced.